Compression of Genomic Re-Sequencing Data

High-throughput sequencing technologies have led to a dramatic decline in genome sequencing costs and to an astonishingly rapid accumulation of genomic data. These technologies are enabling ambitious genome sequencing endeavours, such as the 1000 Genomes Project and the 1001 (''Arabidopsis thaliana'') Genomes Project. The storage and transfer of this tremendous amount of genomic data have become a mainstream problem, motivating the development of high-performance compression tools designed specifically for genomic data. A recent surge of interest in novel algorithms and tools for storing and managing genomic re-sequencing data emphasizes the growing demand for efficient methods of genomic data compression.

== General Concepts ==

While standard data compression tools (e.g., zip and rar) are being used to compress sequence data (e.g., GenBank flat files), this approach has been criticized as extravagant because genomic sequences often contain repetitive content (e.g., microsatellite sequences) and many sequences exhibit high levels of similarity (e.g., multiple genome sequences from the same species), neither of which general-purpose tools are designed to exploit. Additionally, the statistical and information-theoretic properties of genomic sequences can potentially be exploited for compressing sequencing data.〔Giancarlo, R., D. Scaturro, and F. Utro. 2009. Textual data compression in computational biology: a synopsis. ''Bioinformatics'' 25(13): 1575-1586.〕〔Nalbantoglu, Ö. U., D. J. Russell, and K. Sayood. 2010. Data compression concepts and algorithms and their applications to bioinformatics. ''Entropy'' 12(1): 34-52.〕
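
As a rough illustration of why similarity between sequences matters, the following Python sketch stores a sample genome as a list of substitutions against a reference instead of as full text. It is only a toy under stated assumptions: the sequences are random and of equal length, only single-base substitutions are considered, and the function names `substitutions` and `reconstruct` are hypothetical, not taken from any genomic compression tool.

```python
import random
import zlib


def substitutions(reference: str, sample: str) -> list[tuple[int, str]]:
    """Return (position, base) pairs where sample differs from reference."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, b) for i, (a, b) in enumerate(zip(reference, sample)) if a != b]


def reconstruct(reference: str, diffs: list[tuple[int, str]]) -> str:
    """Rebuild the sample by applying the stored substitutions to the reference."""
    bases = list(reference)
    for pos, base in diffs:
        bases[pos] = base
    return "".join(bases)


# Assumption: a random 10,000-base "reference" and a sample with 10 substitutions.
rng = random.Random(0)
reference = "".join(rng.choice("ACGT") for _ in range(10_000))
sample_bases = list(reference)
for pos in rng.sample(range(len(reference)), 10):
    # Pick a base that is guaranteed to differ from the reference base.
    sample_bases[pos] = "A" if reference[pos] != "A" else "C"
sample = "".join(sample_bases)

diffs = substitutions(reference, sample)
assert reconstruct(reference, diffs) == sample  # the representation is lossless

print(len(zlib.compress(sample.encode())), "bytes using a general-purpose compressor")
print(len(diffs), "substitutions to store when the reference is already available")
```

The general-purpose compressor has to encode the whole sample, whereas the difference list grows only with the number of positions at which the sample departs from the reference. Real re-sequencing data also contains insertions, deletions, and quality information, which this sketch does not attempt to handle.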
Excerpt source: Wikipedia, the free encyclopedia (English edition), article "Compression of Genomic Re-Sequencing Data".